Add support for new gfx1200 and gfx1201 targets #12372
IMbackK merged 7 commits into ggml-org:master
CC: @powderluv
@JohannesGaessler Could you please update the labels? I don't have the correct permissions for that: GraphQL: slojosic-amd does not have the correct permissions to execute
ggml/src/ggml-cuda/ggml-cuda.cu (outdated)
  CUBLAS_CHECK(cublasSetStream(ctx.cublas_handle(id), stream));

- if (GGML_CUDA_CC_IS_CDNA(compute_capability)) {
+ if (GGML_CUDA_CC_IS_CDNA(compute_capability) || GGML_CUDA_CC_IS_RDNA4(compute_capability)) {
If V_WMMA_F32_16X16X16_F16 does better here than V_WMMA_F16_16X16X16_F16 on RDNA4, it stands to reason that it does on RDNA3 too.
V_WMMA_F32_16X16X16_F16 does better on RDNA4 because hipBLASLt supports it, and hipBLASLt is the default rocBLAS backend for non-batched and strided-batched GEMMs on gfx12. On gfx11, however, Tensile is the default backend, and performance is worse with V_WMMA_F32_16X16X16_F16.
Is there a plan to fix this rather arbitrary limitation on gfx11 in rocblas/hipblaslt?
Wonder if that's worth an issue report?
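To make the gating discussed in this thread concrete, here is a minimal, self-contained sketch of choosing the GEMM compute type per architecture family. It mirrors the condition in the diff (FP32 for CDNA and RDNA4, FP16 otherwise). The `GpuArch` and `GemmComputeType` enums and the `pick_compute_type` helper are hypothetical names used for illustration only. They are not ggml's actual macros, and the real code works on the device's compute capability value instead.

```cpp
#include <cassert>

// Hypothetical architecture families, for illustration only.
// ggml itself classifies devices via macros such as GGML_CUDA_CC_IS_CDNA.
enum class GpuArch { CDNA, RDNA3, RDNA4, Other };

// Hypothetical label for the accumulation precision requested from the BLAS library.
enum class GemmComputeType { F16, F32 };

// Same shape as the condition in the diff: FP32 compute on CDNA and RDNA4,
// FP16 compute everywhere else (including RDNA3 / gfx11).
constexpr GemmComputeType pick_compute_type(GpuArch arch) {
    return (arch == GpuArch::CDNA || arch == GpuArch::RDNA4)
        ? GemmComputeType::F32
        : GemmComputeType::F16;
}

// Compile-time checks of the selection table.
static_assert(pick_compute_type(GpuArch::RDNA4) == GemmComputeType::F32, "RDNA4 uses FP32 compute");
static_assert(pick_compute_type(GpuArch::RDNA3) == GemmComputeType::F16, "RDNA3 stays on FP16 compute");
```

Keeping the choice in one small function means that a later change for gfx11, should the backend situation described above improve, would touch a single condition.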
Any progress on this PR?
ggml/src/ggml-cuda/ggml-cuda.cu (outdated)
  CUBLAS_CHECK(cublasSetStream(ctx.cublas_handle(id), stream));

- if (GGML_CUDA_CC_IS_CDNA(cc)) {
+ const int compute_capability = ggml_cuda_info().devices[ctx.device].cc;
Do not repeat a value that is already available in the function.
Is this supported on Windows? How can I build for gfx1200 on Windows?
It should theoretically compile fine on Windows. At least it did for RDNA3 with the HIP SDK a while ago.